Combining Constituent Parsers

Authors

  • Victoria Fossum
  • Kevin Knight
Abstract

Combining the 1-best output of multiple parsers via parse selection or parse hybridization improves f-score over the best individual parser (Henderson and Brill, 1999; Sagae and Lavie, 2006). We propose three ways to improve upon existing methods for parser combination. First, we propose a method of parse hybridization that recombines context-free productions instead of constituents, thereby preserving the structure of the output of the individual parsers to a greater extent. Second, we propose an efficient linear-time algorithm for computing expected f-score using Minimum Bayes Risk parse selection. Third, we extend these parser combination methods from multiple 1-best outputs to multiple n-best outputs. We present results on WSJ section 23 and also on the English side of a Chinese-English parallel corpus.
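To make the idea of Minimum Bayes Risk parse selection concrete, the following is a minimal sketch, not the authors' algorithm. It assumes each candidate parse is represented as a set of labeled spans `(label, start, end)` and that each candidate carries a weight standing in for its posterior probability. The names `f1` and `select_parse` are hypothetical. The sketch scores every candidate against every other candidate, so it is quadratic in the number of candidates rather than linear-time as in the method the abstract describes.

```python
from typing import FrozenSet, List, Tuple

# Assumed representation: a parse is a set of labeled spans (label, start, end).
Constituent = Tuple[str, int, int]
Parse = FrozenSet[Constituent]


def f1(candidate: Parse, reference: Parse) -> float:
    """Labeled-bracket F1 of `candidate` measured against `reference`."""
    matched = len(candidate & reference)
    if matched == 0:
        return 0.0
    precision = matched / len(candidate)
    recall = matched / len(reference)
    return 2 * precision * recall / (precision + recall)


def select_parse(parses: List[Parse], weights: List[float]) -> int:
    """Return the index of the candidate with the highest expected F1.

    Each candidate is scored against all candidates (itself included),
    weighted by the normalized weights. This is a naive quadratic sketch.
    """
    total = sum(weights)
    best_index, best_score = 0, -1.0
    for i, cand in enumerate(parses):
        expected = sum(
            w * f1(cand, other) for other, w in zip(parses, weights)
        ) / total
        if expected > best_score:
            best_index, best_score = i, expected
    return best_index


# Three hypothetical parser outputs for one five-word sentence.
p1 = frozenset({("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)})
p2 = frozenset({("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)})
p3 = frozenset({("S", 0, 5), ("NP", 0, 1), ("VP", 1, 5), ("NP", 3, 5)})

chosen = select_parse([p1, p2, p3], [1.0, 1.0, 1.0])
```

With uniform weights, the first parse is chosen because it shares the most constituents with the other two. Extending the candidate list from one output per parser to an n-best list per parser only changes the inputs, not the selection rule.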

Related resources

The Integration of Syntactic Parsing and Semantic Role Labeling

This paper describes a system for the CoNLL-2005 Shared Task on Semantic Role Labeling. We trained two parsers with the training corpus in which the semantic argument information is attached to the constituent labels, we then used the resulting parse trees as the input of the pipelined SRL system. We present our results of combining the output of various SRL systems using different parsers.

Exploiting Diversity in Natural Language Processing: Combining Parsers

Three state-of-the-art statistical parsers are combined to produce more accurate parses, as well as new bounds on achievable Treebank parsing accuracy. Two general approaches are presented and two combination techniques are described for each approach. Both parametric and non-parametric models are explored. The resulting parsers surpass the best previously published performance results for th...

A Comparison of Chinese Parsers for Stanford Dependencies

Stanford dependencies are widely used in natural language processing as a semantically oriented representation, commonly generated either by (i) converting the output of a constituent parser, or (ii) predicting dependencies directly. Previous comparisons of the two approaches for English suggest that starting from constituents yields higher accuracies. In this paper, we re-evaluate both methods ...

Discontinuity (Re)²visited: A Minimalist Approach to Pseudoprojective Constituent Parsing

In this paper, we use insights from Minimalist Grammars (Keenan and Stabler, 2003) to argue for a context-free approximation of discontinuous structures that is both easy to parse for state-of-the-art dynamic programming constituent parsers and has a simple and effective method for the reconstruction of discontinuous tree structures. The results achieved on the Tiger treebank – paired with stat...

Why is German Dependency Parsing More Reliable than Constituent Parsing?

In recent years, research in parsing has extended in several new directions. One of these directions is concerned with parsing languages other than English. Treebanks have become available for many European languages, but also for Arabic, Chinese, or Japanese. However, it was shown that parsing results on these treebanks depend on the types of treebank annotations used. Another direction ...

Publication date: 2009